---
title: "RAG (Retrieval-Augmented Generation) in Mastra | RAG"
description: Overview of Retrieval-Augmented Generation (RAG) in Mastra, detailing its capabilities for enhancing LLM outputs with relevant context.
---

# RAG (Retrieval-Augmented Generation) in Mastra

RAG in Mastra helps you enhance LLM outputs by incorporating relevant context from your own data sources, improving accuracy and grounding responses in real information.

Mastra's RAG system provides:

- Standardized APIs to process and embed documents
- Support for multiple vector stores
- Chunking and embedding strategies for optimal retrieval
- Observability for tracking embedding and retrieval performance

## Example

To implement RAG, you process your documents into chunks, create embeddings, store them in a vector database, and then retrieve relevant context at query time.

```ts showLineNumbers copy
import { embed, embedMany } from "ai";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { PgVector } from "@mastra/pg";
import { MDocument } from "@mastra/rag";

// 1. Initialize document
const doc = MDocument.fromText(`Your document text here...`);

// 2. Create chunks
const chunks = await doc.chunk({
  strategy: "recursive",
  size: 512,
  overlap: 50,
});

// 3. Generate embeddings; we need to pass the text of each chunk
const embeddingModel = new ModelRouterEmbeddingModel("openai/text-embedding-3-small");

const { embeddings } = await embedMany({
  values: chunks.map((chunk) => chunk.text),
  model: embeddingModel,
});

// 4. Store in vector database
const pgVector = new PgVector({
  id: "pg-vector",
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
});
// The index must exist before upserting; text-embedding-3-small
// produces 1536-dimensional vectors by default
await pgVector.createIndex({
  indexName: "embeddings",
  dimension: 1536,
});
await pgVector.upsert({
  indexName: "embeddings",
  vectors: embeddings,
  // Keep the chunk text with each vector so query results are readable
  metadata: chunks.map((chunk) => ({ text: chunk.text })),
});

// 5. Query similar chunks; embed the query with the same model first
const { embedding: queryVector } = await embed({
  value: "Your question here...",
  model: embeddingModel,
});
const results = await pgVector.query({
  indexName: "embeddings",
  queryVector,
  topK: 3,
});

console.log("Similar chunks:", results);
```

This example shows the essentials: initialize a document, create chunks, generate embeddings, store them, and query for similar content.
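
Once similar chunks come back, they are typically placed into the prompt sent to the LLM. The following is a minimal sketch of that last step, not a Mastra API: `buildPrompt` and the `RetrievedChunk` type are hypothetical names, and the sketch assumes the chunk text was stored as `metadata.text` alongside each vector.

```ts
// Hypothetical result shape: assumes the chunk text lives in metadata.text.
type RetrievedChunk = {
  id: string;
  score: number;
  metadata?: { text?: string };
};

// Build a grounded prompt from retrieved chunks (illustrative helper).
function buildPrompt(question: string, results: RetrievedChunk[]): string {
  const context = results
    .map((result) => result.metadata?.text)
    // Skip results that carry no usable text
    .filter((text): text is string => typeof text === "string" && text.length > 0)
    // Number each chunk so the model can refer back to it
    .map((text, index) => `[${index + 1}] ${text}`)
    .join("\n");

  return [
    "Answer the question using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildPrompt("What is RAG?", [
  { id: "1", score: 0.9, metadata: { text: "RAG retrieves context before generation." } },
  { id: "2", score: 0.7, metadata: {} },
]);

console.log(prompt);
```

Numbering the chunks is a design choice rather than a requirement; it makes it easier to ask the model to cite which chunk supports its answer.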

## Document Processing

The basic building block of RAG is document processing. Documents can be chunked using various strategies (recursive, sliding window, etc.) and enriched with metadata. See the [chunking and embedding doc](./chunking-and-embedding).
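
To make the `size` and `overlap` parameters concrete, here is a simplified character-based sliding window. It is only an illustration under stated assumptions: `chunkWithOverlap` is a hypothetical helper, not Mastra's chunker, and real strategies such as `recursive` split on structural boundaries rather than fixed offsets.

```ts
// Illustrative only: a fixed-size sliding window over characters.
function chunkWithOverlap(text: string, size: number, overlap: number): string[] {
  if (size <= 0) throw new Error("size must be positive");
  if (overlap < 0 || overlap >= size) {
    throw new Error("overlap must be at least 0 and smaller than size");
  }

  const step = size - overlap; // how far the window advances each time
  const chunks: string[] = [];

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    // Stop once the window has reached the end of the text
    if (start + size >= text.length) break;
  }
  return chunks;
}

const pieces = chunkWithOverlap("abcdefghij", 4, 2);
console.log(pieces); // [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

Each chunk repeats the last two characters of the previous one, which is what overlap buys you: text that straddles a boundary still appears whole in at least one chunk.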

## Vector Storage

Mastra supports multiple vector stores for embedding persistence and similarity search, including pgvector, Pinecone, Qdrant, and MongoDB. See the [vector database doc](./vector-databases).
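
Conceptually, similarity search ranks stored vectors by how close they are to the query vector. The sketch below shows a brute-force version using cosine similarity. It is an in-memory illustration, not how any particular store is implemented: `StoredVector`, `cosineSimilarity`, and `topK` are hypothetical names, and production vector stores generally rely on indexes and may use other distance metrics.

```ts
// Hypothetical in-memory record: a vector plus the text it was derived from.
type StoredVector = { id: string; vector: number[]; metadata: { text: string } };

// Cosine similarity: 1 means same direction, 0 means orthogonal.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and keep the k best.
function topK(query: number[], store: StoredVector[], k: number) {
  return store
    .map((item) => ({
      id: item.id,
      score: cosineSimilarity(query, item.vector),
      metadata: item.metadata,
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const store: StoredVector[] = [
  { id: "a", vector: [1, 0], metadata: { text: "about cats" } },
  { id: "b", vector: [0, 1], metadata: { text: "about dogs" } },
  { id: "c", vector: [1, 1], metadata: { text: "about pets" } },
];

const hits = topK([1, 0], store, 2);
console.log(hits.map((hit) => hit.id)); // [ 'a', 'c' ]
```

This also shows why a vector store's index dimension has to match the embedding model: vectors of different lengths cannot be compared.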

## More resources

- [Chain of Thought RAG Example](/examples/v1/rag/usage/cot-rag)
- [All RAG Examples](/examples/v1/) (including different chunking strategies, embedding models, and vector stores)
